767 research outputs found

    Oil and Water? High Performance Garbage Collection in Java with MMTk

    Get PDF
    Increasingly popular languages such as Java and C # require efficient garbage collection. This paper presents the design, implementation, and evaluation of MMTk, a Memory Management Toolkit for and in Java. MMTk is an efficient, composable, extensible, and portable framework for building garbage collectors. MMTk uses design patterns and compiler cooperation to combine modularity and efficiency. The resulting system is more robust, easier to maintain, and has fewer defects than monolithic collectors. Experimental comparisons with monolithic Java and C implementations reveal MMTk has significant performance advantages as well. Performance critical system software typically uses monolithic C at the expense of flexibility. Our results refute common wisdom that only this approach attains efficiency, and suggest that performance critical software can embrace modular design and high-level languages.

    Beltway: Getting Around Garbage Collection Gridlock

    Get PDF
    We present the design and implementation of a new garbage collection framework that significantly generalizes existing copying collectors. The Beltway framework exploits and separates object age and incrementality. It groups objects in one or more increments on queues called belts, collects belts independently, and collects increments on a belt in first-in-first-out order. We show that Beltway configurations, selected by command line options, act and perform the same as semi-space, generational, and older-first collectors, and encompass all previous copying collectors of which we are aware. The increasing reliance on garbage collected languages such as Java requires that the collector perform well. We show that the generality of Beltway enables us to design and implement new collectors that are robust to variations in heap size and improve total execution time over the best generational copying collectors of which we are aware by up to 40%, and on average by 5 to 10%, for small to moderate heap sizes. New garbage collection algorithms are rare, and yet we define not just one, but a new family of collectors that subsumes previous work. This generality enables us to explore a larger design space and build better collectors

    Distilling the Real Cost of Production Garbage Collectors

    Get PDF
    Abridged abstract: despite the long history of garbage collection (GC) and its prevalence in modern programming languages, there is surprisingly little clarity about its true cost. Without understanding their cost, crucial tradeoffs made by garbage collectors (GCs) go unnoticed. This can lead to misguided design constraints and evaluation criteria used by GC researchers and users, hindering the development of high-performance, low-cost GCs. In this paper, we develop a methodology that allows us to empirically estimate the cost of GC for any given set of metrics. By distilling out the explicitly identifiable GC cost, we estimate the intrinsic application execution cost using different GCs. The minimum distilled cost forms a baseline. Subtracting this baseline from the total execution costs, we can then place an empirical lower bound on the absolute costs of different GCs. Using this methodology, we study five production GCs in OpenJDK 17, a high-performance Java runtime. We measure the cost of these collectors, and expose their respective key performance tradeoffs. We find that with a modestly sized heap, production GCs incur substantial overheads across a diverse suite of modern benchmarks, spending at least 7-82% more wall-clock time and 6-92% more CPU cycles relative to the baseline cost. We show that these costs can be masked by concurrency and generous provisioning of memory/compute. In addition, we find that newer low-pause GCs are significantly more expensive than older GCs, and, surprisingly, sometimes deliver worse application latency than stop-the-world GCs. Our findings reaffirm that GC is by no means a solved problem and that a low-cost, low-latency GC remains elusive. We recommend adopting the distillation methodology together with a wider range of cost metrics for future GC evaluations.Comment: Camera-ready versio

    Draining the Swamp: Micro Virtual Machines as Solid Foundation for Language Development

    Get PDF
    Many of today\u27s programming languages are broken. Poor performance, lack of features and hard-to-reason-about semantics can cost dearly in software maintenance and inefficient execution. The problem is only getting worse with programming languages proliferating and hardware becoming more complicated. An important reason for this brokenness is that much of language design is implementation-driven. The difficulties in implementation and insufficient understanding of concepts bake bad designs into the language itself. Concurrency, architectural details and garbage collection are three fundamental concerns that contribute much to the complexities of implementing managed languages. We propose the micro virtual machine, a thin abstraction designed specifically to relieve implementers of managed languages of the most fundamental implementation challenges that currently impede good design. The micro virtual machine targets abstractions over memory (garbage collection), architecture (compiler backend), and concurrency. We motivate the micro virtual machine and give an account of the design and initial experience of a concrete instance, which we call Mu, built over a two year period. Our goal is to remove an important barrier to performant and semantically sound managed language design and implementation

    Cooperative cache scrubbing

    Get PDF
    Managing the limited resources of power and memory bandwidth while improving performance on multicore hardware is challeng-ing. In particular, more cores demand more memory bandwidth, and multi-threaded applications increasingly stress memory sys-tems, leading to more energy consumption. However, we demon-strate that not all memory traffic is necessary. For modern Java pro-grams, 10 to 60 % of DRAM writes are useless, because the data on these lines are dead- the program is guaranteed to never read them again. Furthermore, reading memory only to immediately zero ini-tialize it wastes bandwidth. We propose a software/hardware coop-erative solution: the memory manager communicates dead and zero lines with cache scrubbing instructions. We show how scrubbing instructions satisfy MESI cache coherence protocol invariants and demonstrate them in a Java Virtual Machine and multicore simula-tor. Scrubbing reduces average DRAM traffic by 59%, total DRAM energy by 14%, and dynamic DRAM energy by 57 % on a range of configurations. Cooperative software/hardware cache scrubbing reduces memory bandwidth and improves energy efficiency, two critical problems in modern systems

    Using remote substituents to control solution structure and anion binding in lanthanide complexes.

    Get PDF
    A study of the anion-binding properties of three structurally related lanthanide complexes, which all contain chemically identical anion-binding motifs, has revealed dramatic differences in their anion affinity. These arise as a consequence of changes in the substitution pattern on the periphery of the molecule, at a substantial distance from the binding pocket. Herein, we explore these remote substituent effects and explain the observed behaviour through discussion of the way in which remote substituents can influence and control the global structure of a molecule through their demands upon conformational space. Peripheral modifications to a binuclear lanthanide motif derived from α,α′-bis(DO3 Ayl)-m-xylene are shown to result in dramatic changes to the binding constant for isophthalate. In this system, the parent compound displays considerable conformational flexibility, yet can be assumed to bind to isophthalate through a well-defined conformer. Addition of steric bulk remote from the binding site restricts conformational mobility, giving rise to an increase in binding constant on entropic grounds as long as the ideal binding conformation is not excluded from the available range of conformers

    Pretenuring for Java

    Get PDF
    Pretenuring is a technique for reducing copying costs in garbage collectors. When pretenuring, the allocator places long-lived objects into regions that the garbage collector will rarely, if ever, collect. We extend previous work on profiling-driven pretenuring as follows. (1) We develop a collector-neutral approach to obtaining object lifetime profile information. We show that our collection of Java programs exhibits a very high degree of homogeneity of object lifetimes at each allocation site. This result is robust with respect to different inputs, and is similar to previous work on ML, but is in contrast to C programs, which require dynamic call chain context information to extract homogeneous lifetimes. Call-site homogeneity considerably simplifies the implementation of pretenuring and makes it more efficient. (2) Our pretenuring advice is neutral with respect to the collector algorithm, and we use it to improve two quite different garbage collectors: a traditional generational collector and an older-first collector. The system is also novel because it classifies and allocates objects into 3 categories: we allocate immortal objects into a permanent region that the collector will never consider, long-lived objects into a region in which the collector placed survivors of the most recent collection, and shortlived objects into the nursery, i.e., the default region. (3) We evaluate pretenuring on Java programs. Our simulation results show that pretenuring significantly reduces collector copying for generational and older-first collectors. 1
    corecore